Computer vision - Lab 7¶

Agenda¶

Image segmentation based on:

  • thresholding,
  • cluster analysis,
  • detecting image features (e.g. edges),
  • region growing.

Helpers¶

In [1]:
%matplotlib inline
import glob

import cv2
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import PIL
import plotly.graph_objects as go
from IPython.display import HTML, display
from matplotlib.colors import ListedColormap
from pandas import DataFrame
from sklearn.mixture import GaussianMixture

pd.options.display.html.border = 0
pd.options.display.float_format = '{:,.2f}'.format

Images¶

  • Image Lenna (available at the link) - one of the most popular images historically used for testing image processing and compression,
  • clevr - comes from the CLEVR dataset, which deals with the Visual Question Answering problem,
  • graf - sample graffiti image from the OpenCV repository,
  • sudoku - sample sudoku image from the OpenCV repository,
  • skittles - several images containing Skittles candies.
In [2]:
# download images
!wget -O lena_std.tif http://www.lenna.org/lena_std.tif
!wget -O clevr.jpg https://cs.stanford.edu/people/jcjohns/clevr/teaser.jpg
!wget -O graf.png https://github.com/opencv/opencv/raw/master/samples/data/graf1.png
!wget -O sudoku.png https://raw.githubusercontent.com/opencv/opencv/master/samples/data/sudoku.png

for i in range(100, 111):
    !wget -O skittles{i}.jpg https://github.com/possibly-wrong/skittles/blob/master/images/{i}.jpg?raw=true
--2024-11-24 20:14:44--  http://www.lenna.org/lena_std.tif
Resolving www.lenna.org (www.lenna.org)... 107.180.37.106
Connecting to www.lenna.org (www.lenna.org)|107.180.37.106|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 786572 (768K) [image/tiff]
Saving to: ‘lena_std.tif’


lena_std.tif        100%[===================>] 768.14K  --.-KB/s    in 0.08s   

2024-11-24 20:14:44 (9.07 MB/s) - ‘lena_std.tif’ saved [786572/786572]


Visualization¶

In [3]:
def imshow(a):
    a = a.clip(0, 255).astype("uint8")
    if a.ndim == 3:
        if a.shape[2] == 4:
            a = cv2.cvtColor(a, cv2.COLOR_BGRA2RGBA)
        else:
            a = cv2.cvtColor(a, cv2.COLOR_BGR2RGB)
    display(PIL.Image.fromarray(a))
In [4]:
css = """
<style type="text/css">
  table, td, table.dataframe, table.dataframe td {
    border: 1px solid black;    //border: double;
    border-collapse: collapse;
    border-style: solid;
    border-spacing: 0px;
    background-color: rgb(250,250,250);
    width: 18px;
    height: 18px;
    text-align: center;
    transform: scale(1.0);
    margin: 2px;
    }
</style>
"""


def h(s):
    return display(HTML(css + DataFrame(s).to_html(header=False, index=False)))
In [5]:
def h_color(a, cmap="gray", scale=2):
    s = [a.shape[0] * scale, a.shape[1] * scale]
    plt.figure(figsize=s)
    plt.tick_params(
        axis="both",
        which="both",
        bottom=False,
        top=False,
        labelbottom=False,
        labelleft=False,
        left=False,
        right=False,
    )
    plt.imshow(a, cmap=cmap)
In [6]:
cmap = ListedColormap(
    [
        "black",
        "tomato",
        "chocolate",
        "darkorange",
        "gold",
        "olive",
        "green",
        "deepskyblue",
        "blueviolet",
        "hotpink",
    ]
)


def h_grid(grid, scale=1):
    h_color(grid, cmap, scale)
In [7]:
def pix_show(pixels, skip_each=1, height=800, width=800, colors=None):
    pixels = pixels[::skip_each]
    if colors is None:
        colors = pixels[:, ::-1]
    else:
        colors = colors[::skip_each]
    b, g, r = pixels[:, 0], pixels[:, 1], pixels[:, 2]
    fig = go.Figure(
        data=[
            go.Scatter3d(
                x=b,
                y=g,
                z=r,
                mode="markers",
                marker={"size": 2, "color": colors, "opacity": 0.7},
            )
        ],
        layout_xaxis_range=[0, 1],
        layout_yaxis_range=[0, 1],
    )
    scene = {
        "xaxis": dict(title="Blue"),
        "yaxis": dict(title="Green"),
        "zaxis": dict(title="Red"),
    }
    fig.update_layout(
        autosize=False, height=height, width=width, scene=scene, showlegend=True
    )
    fig.show()

Image segmentation¶

Image segmentation is a crucial technique in computer vision that involves partitioning an image into distinct regions or segments, each representing a meaningful part of the image. By classifying pixels into categories, segmentation helps in understanding the structure and content of an image, making it essential for applications such as object detection, medical imaging, autonomous driving, and image editing.

There are two primary types of image segmentation:

  • Semantic segmentation - assigns a label to every pixel in an image based on its category, ensuring that all pixels belonging to the same object type (e.g., "tree" or "car") are grouped together.
  • Instance segmentation - not only assigns category labels to pixels but also distinguishes between different instances of the same object category.

The main techniques of classical semantic segmentation:

  • Thresholding methods: These techniques segment an image by converting it into binary form using one or more thresholds applied to a specific color space. This separates regions of interest from the background based on intensity or color values.
  • Edge detection-based segmentation: By detecting changes in intensity, these methods identify object boundaries, using edge-detection algorithms like Sobel or Canny to segment images.
  • Region-based segmentation: This approach groups neighboring pixels with similar properties, such as color or intensity, to form cohesive regions, often starting with a seed point and expanding outward.
  • Graph-based segmentation: Images are represented as graphs, where pixels or regions are nodes, and edges define the relationships between them, typically based on similarity in color, intensity, or spatial proximity. The goal is to partition the graph into subsets or segments such that nodes within a segment are highly similar, while nodes across different segments are significantly dissimilar.
  • Segmentation by clustering: This method involves grouping pixels into clusters based on their feature similarity (e.g., color, intensity, or texture), using clustering algorithms. Typically, the arrangement of pixels and their neighborhood relationships are not directly considered, focusing instead on their feature-based attributes.

Segmentation through thresholding¶

The easiest way to segment an image is to threshold pixel intensities. The thresholding operation replaces all intensities above a certain threshold with one constant value, and all intensities below it with another.

There is also segmentation with multiple thresholds, which was presented in the first class.
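Before using OpenCV's helpers below, it may help to see that global thresholding is just an elementwise comparison. A minimal NumPy sketch (the threshold of 127 and the tiny test patch are illustrative choices, not values from the lab):

```python
import numpy as np

def global_threshold(img, t=127, max_value=255):
    """Return a binary image: max_value where img > t, 0 elsewhere."""
    return np.where(img > t, max_value, 0).astype(np.uint8)

# Tiny synthetic 2x3 grayscale patch for demonstration
patch = np.array([[10, 120, 130], [126, 127, 255]], dtype=np.uint8)
print(global_threshold(patch))
```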

In [8]:
lena = cv2.imread("./lena_std.tif", cv2.IMREAD_COLOR)
imshow(lena)
In [9]:
lena_gray = cv2.cvtColor(lena, cv2.COLOR_BGR2GRAY)
imshow(lena_gray)
In [10]:
lut = np.array([255] * 100 + [0] * 100 + [255] * 56)
lena_lut = cv2.LUT(lena_gray, lut)
imshow(lena_lut)

The OpenCV library also includes a ready-made implementation of other simple image thresholding approaches. To perform the thresholding operation in OpenCV, the threshold() function should be called, which takes the image, the threshold value, the maximum value and the threshold method that should be used.

The available thresholding methods include:

  • binary - pixels with intensities below the threshold are set to 0, the others to the maximum value,
  • binary inverted - pixels with intensities above the threshold are set to 0, the others to the maximum value,
  • truncate - pixels with intensities above the threshold are set to the threshold value; the others remain unchanged,
  • to zero - pixels with intensities below the threshold are set to 0; the others remain unchanged,
  • to zero inverted - pixels with intensities above the threshold are set to 0; the others remain unchanged.
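As a sketch, each of these rules can be written as a single NumPy expression (pure NumPy, independent of OpenCV; `t` is the threshold and `m` the maximum value — the sample array is illustrative):

```python
import numpy as np

def threshold_rules(img, t=127, m=255):
    """Apply the five classic thresholding rules to a grayscale array."""
    binary      = np.where(img > t, m, 0)    # below t -> 0, above -> m
    binary_inv  = np.where(img > t, 0, m)    # above t -> 0, below -> m
    truncate    = np.where(img > t, t, img)  # above t -> t, rest unchanged
    to_zero     = np.where(img > t, img, 0)  # below t -> 0, rest unchanged
    to_zero_inv = np.where(img > t, 0, img)  # above t -> 0, rest unchanged
    return binary, binary_inv, truncate, to_zero, to_zero_inv

img = np.array([0, 100, 127, 128, 200, 255], dtype=np.uint8)
for name, out in zip(
    ["binary", "binary_inv", "trunc", "tozero", "tozero_inv"],
    threshold_rules(img),
):
    print(name, out)
```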
In [11]:
_, lena_bin = cv2.threshold(lena_gray, 127, 255, cv2.THRESH_BINARY)
_, lena_bin_inv = cv2.threshold(lena_gray, 127, 255, cv2.THRESH_BINARY_INV)
_, lena_trunc = cv2.threshold(lena_gray, 127, 255, cv2.THRESH_TRUNC)
_, lena_tozero = cv2.threshold(lena_gray, 127, 255, cv2.THRESH_TOZERO)
_, lena_tozero_inv = cv2.threshold(lena_gray, 127, 255, cv2.THRESH_TOZERO_INV)

imshow(np.concatenate([lena_gray, lena_bin, lena_bin_inv], 1))
imshow(np.concatenate([lena_trunc, lena_tozero, lena_tozero_inv], 1))

Sudoku example¶

In [12]:
sudoku = cv2.imread("./sudoku.png", cv2.IMREAD_GRAYSCALE)
imshow(sudoku)
print(sudoku.shape)
(563, 558)
In [13]:
_, sudoku_bin = cv2.threshold(sudoku, 70, 255, cv2.THRESH_BINARY)
imshow(sudoku_bin)
In [14]:
lut = np.array([0] * 50 + [255] * 80 + [0] * 126)
sudoku_lut = cv2.LUT(sudoku, lut)
imshow(sudoku_lut)

OTSU¶

OTSU is an algorithm designed to automatically determine the threshold value so that the pixel intensity variance within the two resulting classes is minimized, which is equivalent to maximizing the between-class variance.

The objective function minimized by OTSU is expressed as: $$\sigma^2_w(t) = Q_1\sigma^2_1 + Q_2\sigma^2_2$$ where:

  • $Q_i$ - represents the probability of a pixel belonging to the i-th class. This probability is derived from the cumulative distribution function: $$Q_i = P_i(f(x,y) < t_i)$$
  • $\sigma^2_i$ - denotes the variance within the i-th class.

To identify the optimal threshold value, the objective is to minimize the expression across all possible values (from 0 to 255).

The initial step is to calculate the probability density function (PDF), i.e., the probability of a pixel having a specific intensity (ranging from 0 to 255). This can be achieved by computing the histogram and normalizing it. Additionally, the cumulative distribution function (CDF) will be computed to facilitate the calculation of means and conditional variances.
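Equivalently, the PDF and CDF can be computed with plain NumPy, using `np.bincount` in place of `cv2.calcHist` (a self-contained sketch on a synthetic image, not the sudoku image from above):

```python
import numpy as np

def pdf_cdf(img):
    """Normalized histogram (PDF) and its running sum (CDF) for a uint8 image."""
    counts = np.bincount(img.ravel(), minlength=256)
    pdf = counts / counts.sum()
    cdf = np.cumsum(pdf)
    return pdf, cdf

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
pdf, cdf = pdf_cdf(img)
```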

In [15]:
def calculate_pdf_cdf(img):
    h = cv2.calcHist([img], [0], None, [256], [0, 256])
    pdf = h.ravel() / h.sum()
    cdf = np.cumsum(pdf)
    return pdf, cdf


pdf, cdf = calculate_pdf_cdf(sudoku)

plt.plot(pdf)
plt.show()

plt.plot(cdf)
plt.show()

In a single iteration of the OTSU algorithm, with a specific threshold set, the algorithm splits the probability distribution into the probabilities of pixel occurrences in the two classes. The mean value is determined using the formula for the conditional expected value (because we are averaging pixels under the condition that a specific class occurs: the ''expected value for a given class'').

$$ E(C_1 | t_1) = \sum_{x =1}^{t_1} \frac{xP(x)}{Q_1}$$

Then, to calculate the (conditional!) variances, we use the previously calculated conditional expected values. $$\sigma^2_1 = \sum_{x =1}^{t_1} \frac{(x - E[C_1|t_1])^2P(x)}{Q_1}$$

In [16]:
def cond_mean(x, p, q):
    # Calculate the conditional mean for a given class
    return np.sum(x * p / q)


def otsu_one_step(x, pdf, cdf, epsilon=1e-6):
    """
    Calculates one step of the Otsu algorithm for one threshold value x.
    """
    # Split intensity values into lower and higher than the threshold
    x_low, x_high = np.arange(x), np.arange(x, 256)
    # Split probability densities into lower and higher than the threshold
    p_low, p_high = pdf[:x], pdf[x:]
    # Calculate cumulative probabilities for lower and higher classes
    q_low, q_high = cdf[x], cdf[-1] - cdf[x]

    # Check for small cumulative probabilities to avoid division by zero
    if q_low < epsilon or q_high < epsilon:
        return None

    # Calculate conditional means for lower and higher classes
    m_low, m_high = cond_mean(x_low, p_low, q_low), cond_mean(x_high, p_high, q_high)
    # Calculate conditional variances for lower and higher classes
    s_low  = cond_mean((x_low - m_low) ** 2, p_low, q_low)
    s_high = cond_mean((x_high - m_high) ** 2, p_high, q_high)

    # Combine conditional variances to optimize Otsu's method
    return s_low * q_low + s_high * q_high

In its most basic version, the OTSU algorithm iterates over all possible thresholds and selects the one for which the objective function presented above returns the smallest value.

In [17]:
def otsu(img):
    pdf, cdf = calculate_pdf_cdf(img)
    v_min = None
    threshold = 0

    for i in range(1, 254):
        v = otsu_one_step(i, pdf, cdf)
        if v is not None and (v_min is None or v < v_min):
            v_min = v
            threshold = i + 1

    _, img_otsu = cv2.threshold(img, threshold, 255, cv2.THRESH_BINARY)

    return threshold, img_otsu

In order to use the OTSU method in OpenCV, just add THRESH_OTSU to the thresholding operation.

In [18]:
th_self, sudoku_otsu_self = otsu(sudoku)
th_auto, sudoku_otsu_auto = cv2.threshold(
    sudoku, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU
)

print(
    "Found value of thresholding with the OTSU algorithm (custom implementation):",
    th_self,
)
print("Found value of thresholding with the OTSU algorithm (OpenCV):", th_auto)

print("\nOTSU (custom implementation):")
imshow(sudoku_otsu_self)
print("\nOTSU (OpenCV):")
imshow(sudoku_otsu_auto)
Found value of thresholding with the OTSU algorithm (custom implementation): 96
Found value of thresholding with the OTSU algorithm (OpenCV): 96.0

OTSU (custom implementation):
OTSU (OpenCV):

Adaptive methods¶

Among the thresholding methods, there are also adaptive methods. These are thresholding methods that adjust the threshold value depending on the image content.

Adaptive thresholding methods often work very well when the input image is divided into smaller areas and the threshold value is adjusted separately for each area. The motivation behind such a mechanism is the fact that in real images the lighting (as well as focus, balance, etc.) is uneven.

  • ADAPTIVE_THRESH_MEAN_C - The threshold value is the mean of the blockSize × blockSize neighborhood, minus a constant.
  • ADAPTIVE_THRESH_GAUSSIAN_C - The threshold value is the Gaussian-weighted sum of the blockSize × blockSize neighborhood, minus a constant.
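The mean variant can be sketched without OpenCV: the per-pixel threshold is the local box-filter mean minus the constant. A minimal NumPy version (with replicated borders; edge handling and performance are simplified compared to cv2.adaptiveThreshold):

```python
import numpy as np

def adaptive_mean_threshold(img, block_size=3, c=2, max_value=255):
    """Binary image: max_value where pixel > (local mean - c), else 0."""
    pad = block_size // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    h, w = img.shape
    # Accumulate the block_size x block_size neighborhood sums by shifting
    local_mean = np.zeros((h, w))
    for dy in range(block_size):
        for dx in range(block_size):
            local_mean += padded[dy:dy + h, dx:dx + w]
    local_mean /= block_size ** 2
    return np.where(img > local_mean - c, max_value, 0).astype(np.uint8)
```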
In [19]:
blockSize = 101
constant = 2
lena_ad_mean = cv2.adaptiveThreshold(
    lena_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, blockSize, constant
)
lena_ad_gauss = cv2.adaptiveThreshold(
    lena_gray,
    255,
    cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
    cv2.THRESH_BINARY,
    blockSize,
    constant,
)

imshow(np.concatenate([lena_ad_mean, lena_ad_gauss], 1))
In [20]:
blockSize = 3
constant = 2
sudoku_ad_mean = cv2.adaptiveThreshold(
    sudoku, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, blockSize, constant
)
sudoku_ad_gauss = cv2.adaptiveThreshold(
    sudoku,
    255,
    cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
    cv2.THRESH_BINARY,
    blockSize,
    constant,
)

imshow(np.concatenate([sudoku_ad_mean, sudoku_ad_gauss], 1))

Segmentation of multi-channel images¶

Segmenting multi-channel images, such as RGB, using straightforward thresholding methods becomes challenging due to the necessity of defining thresholds in N-dimensional space. As a result, cluster analysis methods are frequently employed instead of simplistic thresholding for the segmentation of multi-channel images.

Cluster analysis involves identifying clusters of pixels within a specific space, sometimes directly in the intensity space. This process entails creating distinct pixel classes based on their characteristics within that space.

To illustrate, let's conduct a basic pixel intensity analysis on the given image below.

In [21]:
graf = cv2.imread("./graf.png", cv2.IMREAD_COLOR)
graf = cv2.resize(graf, None, fx=0.75, fy=0.75)
imshow(graf)
No description has been provided for this image

The image, presented as a list of (BGR) pixels, is displayed in 3D space, where the coordinates of a given pixel are its intensity values. Additionally, the pixels have been colored according to their intensities.

In [22]:
graf_pixels = graf.reshape([-1, 3])
pix_show(graf_pixels, 16)

From the analysis of the above visualization, a few conclusions can be drawn:

  • most pixels lie on a straight line between black (0, 0, 0) and white (255, 255, 255),
  • a few color clusters can be distinguished:
    • red,
    • blue,
    • gold / orange,
    • green,
    • claret,
    • dark purple.

One method of splitting such a space into clusters is the Gaussian Mixture Model, which approximates the distribution of the clusters with N Gaussian distributions. It is a method with trainable parameters, so we need a sample of data to which the mathematical model can be fitted.

We will use the pixel list (BGR) as the data to fit the model, and then assign to each pixel the index of the Gaussian component to which it belongs with the highest probability.

In [23]:
# model initialization and training
model = GaussianMixture(n_components=8)
model.fit(graf_pixels)

# assigning classes to pixels
segments = model.predict(graf_pixels)
print(segments)
[1 1 1 ... 4 4 4]

The next step will be to calculate the average color for each class (segment) and display the pixels again with the colors representing the segmentation.

In [24]:
segments_colors = np.stack([graf_pixels[segments == i].mean(0) for i in range(8)], 0)
colors = np.take(segments_colors, segments, 0)

pix_show(graf_pixels, 16, colors=colors[:, ::-1])

The pixels with their mapped classes form the final image segmentation based on cluster analysis. The input image and the segmentation result are shown below.

In [25]:
segmented = colors.reshape(graf.shape)
imshow(np.concatenate([graf, segmented], 1))
No description has been provided for this image

Segmentation by edge detection¶

Segmentation through edge detection builds on the previous class on detecting key points, corners and edges.

The idea is to divide the image along its edges, and then fill the closed areas, assigning a consecutive identifier to each separable area.

The assignment of identifiers to areas is presented in the next section.

In [26]:
clevr = cv2.imread("./clevr.jpg", cv2.IMREAD_COLOR)
clevr = cv2.resize(clevr, None, fx=0.5, fy=0.5)
imshow(clevr)
No description has been provided for this image
In [27]:
clevr_gray = cv2.cvtColor(clevr, cv2.COLOR_BGR2GRAY)
imshow(clevr_gray)
No description has been provided for this image
In [28]:
canny_high, _ = cv2.threshold(clevr_gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
canny_low = 0.5 * canny_high

clevr_canny = cv2.Canny(clevr_gray, canny_low, canny_high, 9)
clevr_canny = cv2.morphologyEx(
    clevr_canny, cv2.MORPH_CLOSE, kernel=np.ones((3, 3), np.uint8)
)
imshow(clevr_canny)
No description has been provided for this image

Segmentation by region growing¶

Segmentation by region growing consists in iteratively joining adjacent areas until a certain condition is met. Areas are joined after passing a uniformity test, and the algorithm runs until the stop condition is met.

Uniformity test - a check performed when combining areas. An example condition: the difference of the mean pixel intensities of the two areas. If it is greater than a certain threshold, the areas are not uniform and are not joined.

Stop condition - this can simply be the absence of any further merges, or a condition for stopping the algorithm early (e.g. when we want no area to grow larger than a set limit).

For region growing, previously detected edges can be used as region boundaries; the stop condition is then that every pixel inside an edge-bounded area belongs to the region.
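As an illustration (the function name, regions and threshold below are made up for this sketch, not taken from the lab code), a mean-intensity uniformity test could look like this:

```python
import numpy as np

def uniform(region_a, region_b, threshold=10.0):
    """Uniformity test sketch: the two regions may be merged only if
    their mean intensities differ by less than `threshold`."""
    return abs(np.mean(region_a) - np.mean(region_b)) < threshold

# two regions with similar mean intensities -> uniform, can be merged
print(uniform([10, 12, 11], [14, 13, 15]))     # True
# very different mean intensities -> not uniform, stay separate
print(uniform([10, 12, 11], [200, 210, 190]))  # False
```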

In [29]:
regions = np.zeros(clevr_canny.shape[:2], np.int32)
neighbours = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def find_neighbours(img, y, x):
    c_neighbours = []
    for dy, dx in neighbours:
        ny, nx = y + dy, x + dx

        if ny < 0 or ny >= img.shape[0] or nx < 0 or nx >= img.shape[1]:
            continue

        if regions[ny, nx] > 0:
            continue

        if img[ny, nx] == 255:
            continue

        if img[y, x] == img[ny, nx]: # Uniformity test
            c_neighbours.append((ny, nx))

    return c_neighbours


def grow_region(img, y, x, cls):
    regions[y, x] = cls

    c_neighbours = find_neighbours(img, y, x)
    for ny, nx in c_neighbours:
        regions[ny, nx] = cls

    while len(c_neighbours) > 0:
        new_neighbours = []
        for ny, nx in c_neighbours:
            i_new_neighbours = find_neighbours(img, ny, nx)
            for _ny, _nx in i_new_neighbours:
                regions[_ny, _nx] = cls

            new_neighbours.extend(i_new_neighbours)

        c_neighbours = new_neighbours


i = 1
for y in range(clevr_canny.shape[0]):
    for x in range(clevr_canny.shape[1]):
        if regions[y, x] == 0 and clevr_canny[y, x] == 0:
            grow_region(clevr_canny, y, x, i)
            i += 1

mean_colors = np.stack(
    [
        np.array([255, 255, 255]) if j == 0 else clevr[regions == j].mean(0)
        for j in range(i)
    ],
    0,
)
regions_colors = np.take(mean_colors, regions, 0)

imshow(regions_colors)
No description has been provided for this image

In this case (detected edges), a similar result can also be obtained using connected components analysis.

In [30]:
kernel = np.ones((3, 3), np.uint8)
edges = cv2.dilate(clevr_canny, kernel, iterations=1)
ret, markers = cv2.connectedComponents(255 - edges)

# color each connected component (label 0 = edges) by its mean color
mean_colors = np.stack(
    [
        np.array([255, 255, 255]) if j == 0 else clevr[markers == j].mean(0)
        for j in range(ret)
    ],
    0,
)
regions_colors = np.take(mean_colors, markers, 0)

imshow(regions_colors)
No description has been provided for this image

Tasks¶

Task 1¶

As in the section on multi-channel image segmentation, perform the same pixel intensity cluster analysis for the './skittles100.jpg' image, and then segment the image using the K-Means algorithm (available, among others, in the scikit-learn library: sklearn.cluster.KMeans).

Present the intermediate results:

  • BGR input image
  • BGR pixels in 3D space,
  • segmentation result on BGR pixels in 3D space,
  • segmentation result as a 2D image (BGR)

BGR input image¶

In [31]:
skittles = cv2.imread("./skittles100.jpg", cv2.IMREAD_COLOR)
skittles = cv2.resize(skittles, None, fx=0.75, fy=0.75)
imshow(skittles)
No description has been provided for this image

BGR pixels in 3D space¶

In [32]:
skittles_pixels = skittles.reshape([-1, 3])
pix_show(skittles_pixels, 16)

Segmentation result on BGR pixels in 3D space¶

In [33]:
number_colors = 10
In [34]:
from sklearn.cluster import KMeans

model = KMeans(n_clusters = number_colors, random_state = 0, n_init='auto')
model.fit(skittles_pixels)

segments = model.predict(skittles_pixels)
print(segments)
[7 7 7 ... 8 8 8]
In [35]:
segments_colors = np.stack([skittles_pixels[segments == i].mean(0) for i in range(number_colors)], 0)
colors = np.take(segments_colors, segments, 0)

pix_show(skittles_pixels, 16, colors=colors[:, ::-1])

Segmentation result as a 2D image (BGR)¶

In [36]:
segmented_skittles = colors.reshape(skittles.shape)
imshow(np.concatenate([skittles, segmented_skittles], 1))
No description has been provided for this image

Task 2¶

Using the methods you learned in the previous class, find the number of Skittles in the image './skittles100.jpg' (it is not necessary to use the solution from task 1). Show intermediate results and describe the processing steps in a comment.

Show the original image with the found individual Skittles marked on it.

based on:

https://www.youtube.com/watch?v=_Coth4YESzk

https://www.youtube.com/watch?v=rRcwuQGDFOA

I started by performing the segmentation shown above, but on an image after a median blur; however, there was an issue with the purple color and the gray background. I found out by accident that HSV helps with that (so the image is converted to HSV at the beginning).

I changed the 100 to others to fix some issues.

In [59]:
skittles_bgr = cv2.imread("./skittles100.jpg", cv2.IMREAD_COLOR)
skittles = cv2.cvtColor(skittles_bgr, cv2.COLOR_BGR2HSV)
skittles = cv2.resize(skittles, None, fx=0.75, fy=0.75)

kernel_size = 9
median_filtered_skittles = cv2.medianBlur(skittles, kernel_size)
# imshow(np.concatenate([skittles, median_filtered_skittles],1))

skittles_pixels = median_filtered_skittles.reshape([-1, 3])
# pix_show(skittles_pixels, 16)

number_colors = 32
model = KMeans(n_clusters = number_colors, random_state = 0, n_init='auto')
model.fit(skittles_pixels)
segments = model.predict(skittles_pixels)
segments_colors = np.stack([skittles_pixels[segments == i].mean(0) for i in range(number_colors)], 0)
colors = np.take(segments_colors, segments, 0)
pix_show(skittles_pixels, 16, colors=colors[:, ::-1])

segmented_skittles = colors.reshape(skittles.shape)
imshow(np.concatenate([median_filtered_skittles, segmented_skittles], 1))
No description has been provided for this image

I selected the classes/segments containing the background by hand.

In [60]:
for segment in range (0,32):
  print(segment)
  segments_colors1 = np.stack(
      [skittles_pixels[segments == i].mean(0) if i == segment else np.full_like(skittles_pixels[0], 255) for i in range(number_colors)], 0
  )
  colors1 = np.take(segments_colors1, segments, 0)
  # pix_show(skittles_pixels, 16, colors=colors1[:, ::-1])

  segmented_skittles1 = colors1.reshape(median_filtered_skittles.shape)
  imshow(np.concatenate([median_filtered_skittles, segmented_skittles1], 1))
0
No description has been provided for this image
1
No description has been provided for this image
2
No description has been provided for this image
3
No description has been provided for this image
4
No description has been provided for this image
5
No description has been provided for this image
6
No description has been provided for this image
7
No description has been provided for this image
8
No description has been provided for this image
9
No description has been provided for this image
10
No description has been provided for this image
11
No description has been provided for this image
12
No description has been provided for this image
13
No description has been provided for this image
14
No description has been provided for this image
15
No description has been provided for this image
16
No description has been provided for this image
17
No description has been provided for this image
18
No description has been provided for this image
19
No description has been provided for this image
20
No description has been provided for this image
21
No description has been provided for this image
22
No description has been provided for this image
23
No description has been provided for this image
24
No description has been provided for this image
25
No description has been provided for this image
26
No description has been provided for this image
27
No description has been provided for this image
28
No description has been provided for this image
29
No description has been provided for this image
30
No description has been provided for this image
31
No description has been provided for this image

After selecting the background, I built something like a mask that is black where the background is and white where the Skittles are.

In [61]:
segments_colors1 = np.stack(
    [np.full_like(skittles_pixels[0], 0) if i in [1, 4, 8, 9, 10, 15, 16, 17, 20, 21, 23, 29, 30, 31] else np.full_like(skittles_pixels[0], 255) for i in range(number_colors)], 0
)
colors1 = np.take(segments_colors1, segments, 0)
# pix_show(skittles_pixels, 16, colors=colors1[:, ::-1])

segmented_skittles1 = colors1.reshape(skittles.shape)
imshow(np.concatenate([skittles, segmented_skittles1], 1))
No description has been provided for this image

There were still some dots, so I dilated to get rid of the black ones and then eroded twice to shrink and separate the Skittles (the kernels were selected by trial and error). Then I used Canny to find contours and dilated to make the contours more visible.

In [108]:
dilated = cv2.dilate(segmented_skittles1.copy(), np.ones((5,5), np.uint8), iterations=1)
eroded = cv2.erode(dilated, np.ones((25,25), np.uint8), iterations=1)
eroded2 = cv2.erode(eroded, np.ones((11,11), np.uint8), iterations=1)
dilated2 = cv2.dilate(eroded2, np.ones((5,5), np.uint8), iterations=3)
canny = cv2.Canny(cv2.GaussianBlur(dilated2, (21,21), 0), 100, 200)
dilated3 = cv2.dilate(canny.copy(), np.ones((3,3), np.uint8), iterations=1)

imshow(dilated)
imshow(eroded)
imshow(eroded2)
imshow(dilated2)
imshow(canny)
imshow(dilated3)
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image

Then I found the contours. There were a few contours per candy, so I filtered out the ones that were too close to each other.

In [109]:
contures, hierarchy = cv2.findContours(canny.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
print(len(contures))
135
In [118]:
sorted_contures = sorted(contures, key=cv2.contourArea, reverse=True)
In [119]:
canny_contures = cv2.drawContours(segmented_skittles1.copy(), contures, -1, (0,255,0), 3)

unique_contours = []

for i, count in enumerate(sorted_contures):
    # print(i)
    x, y, w, h = cv2.boundingRect(count)
    # print(x, y, w, h)

    area_outer = cv2.contourArea(count)
    print(area_outer)
    if area_outer < 2.0:
        continue

    center_outer = (x + w // 2, y + h // 2)
    radius_outer = max(w, h) // 2
    is_inside = False

    # X = x + int(w/2)
    # Y = y + int(h/2)

    for inner in unique_contours:
        x_inner, y_inner, w_inner, h_inner = cv2.boundingRect(inner)
        center_inner = (x_inner + w_inner // 2, y_inner + h_inner // 2)
        radius_inner = max(w_inner, h_inner) // 2

        distance = np.sqrt((center_outer[0] - center_inner[0])**2 + (center_outer[1] - center_inner[1])**2)

        if distance <= 35:
            is_inside = True
            break

    if not is_inside:
        unique_contours.append(count)
        cv2.circle(canny_contures, center_outer, 20, (0,0,255), 2)

    # cv2.circle(canny_contures, (X,Y), 20, (0,0,255), 2)

imshow(canny_contures)
print(len(unique_contours))
490.5
403.5
399.5
384.0
380.5
361.0
317.0
314.5
313.5
296.5
286.0
281.0
271.5
270.0
256.0
247.0
213.0
202.0
15.5
14.0
13.5
13.5
12.5
12.0
12.0
11.5
10.5
10.5
10.5
10.0
9.5
9.5
9.5
9.0
8.5
8.5
7.5
7.5
7.5
7.0
7.0
5.5
5.5
5.5
5.0
5.0
5.0
4.5
4.5
4.5
4.5
4.0
4.0
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.5
3.0
3.0
3.0
3.0
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.5
2.0
2.0
2.0
2.0
2.0
2.0
1.5
1.5
1.5
1.5
1.5
0.5
0.5
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
No description has been provided for this image
60

I put everything except the model creation into a function. The same K-Means model is used for all cases.

In [133]:
def find_skittles(img, model):
    skittles = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    skittles = cv2.resize(skittles, None, fx=0.75, fy=0.75)

    kernel_size = 9
    median_filtered_skittles = cv2.medianBlur(skittles, kernel_size)
    # imshow(np.concatenate([skittles, median_filtered_skittles],1))

    skittles_pixels = median_filtered_skittles.reshape([-1, 3])
    # pix_show(skittles_pixels, 16)

    segments = model.predict(skittles_pixels)
    segments_colors = np.stack([skittles_pixels[segments == i].mean(0) for i in range(number_colors)], 0)
    colors = np.take(segments_colors, segments, 0)
    # pix_show(skittles_pixels, 16, colors=colors[:, ::-1])

    segmented_skittles = colors.reshape(skittles.shape)
    # imshow(np.concatenate([median_filtered_skittles, segmented_skittles], 1))

    segments_colors1 = np.stack(
        [np.full_like(skittles_pixels[0], 0) if i in [1, 4, 8, 9, 10, 15, 16, 17, 20, 21, 23, 29, 30, 31] else np.full_like(skittles_pixels[0], 255) for i in range(number_colors)], 0
    )
    colors1 = np.take(segments_colors1, segments, 0)
    # pix_show(skittles_pixels, 16, colors=colors1[:, ::-1])

    segmented_skittles1 = colors1.reshape(skittles.shape)
    # imshow(np.concatenate([skittles, segmented_skittles1], 1))


    dilated = cv2.dilate(segmented_skittles1.copy(), np.ones((5,5), np.uint8), iterations=1)
    eroded = cv2.erode(dilated, np.ones((25,25), np.uint8), iterations=1)
    eroded2 = cv2.erode(eroded, np.ones((11,11), np.uint8), iterations=1)
    dilated2 = cv2.dilate(eroded2, np.ones((5,5), np.uint8), iterations=3)
    canny = cv2.Canny(cv2.GaussianBlur(dilated2, (21,21), 0), 100, 200)
    dilated3 = cv2.dilate(canny.copy(), np.ones((3,3), np.uint8), iterations=1)

    # imshow(dilated)
    # imshow(eroded)
    # imshow(eroded2)
    # imshow(dilated2)
    # imshow(canny)
    # imshow(dilated3)

    contures, hierarchy = cv2.findContours(canny.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    sorted_contures = sorted(contures, key=cv2.contourArea, reverse=True)


    canny_contures = cv2.drawContours(segmented_skittles1.copy(), contures, -1, (0,255,0), 3)

    unique_contours = []
    final_ing = img.copy()

    final_ing = cv2.resize(final_ing, None, fx=0.75, fy=0.75)

    for i, count in enumerate(sorted_contures):
        # print(i)
        x, y, w, h = cv2.boundingRect(count)
        # print(x, y, w, h)

        area_outer = cv2.contourArea(count)
        if area_outer < 2.0:
            continue

        center_outer = (x + w // 2, y + h // 2)
        radius_outer = max(w, h) // 2
        is_inside = False

        # X = x + int(w/2)
        # Y = y + int(h/2)

        for inner in unique_contours:
            x_inner, y_inner, w_inner, h_inner = cv2.boundingRect(inner)
            center_inner = (x_inner + w_inner // 2, y_inner + h_inner // 2)
            radius_inner = max(w_inner, h_inner) // 2

            distance = np.sqrt((center_outer[0] - center_inner[0])**2 + (center_outer[1] - center_inner[1])**2)

            if distance <= 35 :
                is_inside = True
                break

        if not is_inside:
            unique_contours.append(count)
            cv2.circle(final_ing, center_outer, 20, (0,0,255), 2)

        # cv2.circle(canny_contures, (X,Y), 20, (0,0,255), 2)

    imshow(np.concatenate([final_ing, canny_contures], 1))
    print(len(unique_contours))
In [134]:
find_skittles(cv2.imread("./skittles104.jpg", cv2.IMREAD_COLOR), model)
No description has been provided for this image
58

Task 3¶

  1. Test the solution from task 2 on the remaining skittles images.
  2. Improve the solution so that it works properly for these images.
In [135]:
for file in glob.glob("./skittles*"):
    print(file)
    skittles2 = cv2.imread(file, cv2.IMREAD_COLOR)
    # imshow(cv2.resize(skittles2, None, fx=0.4, fy=0.4))
    find_skittles(skittles2, model)

# TODO
./skittles104.jpg
No description has been provided for this image
58
./skittles108.jpg
No description has been provided for this image
56
./skittles109.jpg
No description has been provided for this image
57
./skittles110.jpg
No description has been provided for this image
62
./skittles106.jpg
No description has been provided for this image
58
./skittles101.jpg
No description has been provided for this image
55
./skittles105.jpg
No description has been provided for this image
59
./skittles100.jpg
No description has been provided for this image
60
./skittles102.jpg
No description has been provided for this image
60
./skittles103.jpg
No description has been provided for this image
57
./skittles107.jpg
No description has been provided for this image
57

The outcome is sometimes correct, but sometimes 1 to 3 Skittles are missing.

It mostly struggles because of the black dot in the middle of each candy (though sometimes this is suddenly not a problem). I am reducing the white and black noise with erosion and dilation, but when trying to remove bigger noise we also remove candy, so the current version simply works best for now.

It could be further improved, for example by treating the big contours as two candies and creating two circles in their place.
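A sketch of that improvement idea (the helper, the area values and the `min_area` cutoff below are hypothetical, not measured from the Skittles images): use the median contour area as the typical single-candy size, and count an oversized contour as several candies.

```python
import numpy as np

def estimate_candy_count(areas, min_area=50.0):
    """Hypothetical helper: a contour much larger than the typical
    single-candy area probably covers several touching candies,
    so count it several times instead of once."""
    areas = [a for a in areas if a >= min_area]  # drop tiny noise contours
    typical = np.median(areas)                   # reference single-candy area
    return int(sum(max(1, round(a / typical)) for a in areas))

# made-up areas: nine single candies plus one merged blob of two
areas = [300, 310, 295, 305, 300, 290, 315, 300, 298, 610]
print(estimate_candy_count(areas))  # -> 11
```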